Rank in Wordlist | Word | Rank in Wordlist | Word |
---|---|---|---|
1 | kuti | 26 | shoko |
2 | Mwari | 27 | vose |
3 | uye | 28 | akadaro |
4 | Jesu | 29 | zvinhu |
5 | kana | 30 | nokuti |
6 | asi | 31 | ainge |
7 | here | 32 | chinhu |
8 | vanhu | 33 | uyu |
9 | 2. | 34 | kuita |
10 | 1. | 35 | chete |
11 | sei | 36 | naMwari |
12 | kuna | 37 | wake |
13 | munhu | 38 | ari |
14 | raMwari | 39 | Bennie |
15 | pamusoro | 40 | Bill |
16 | kubva | 41 | iri |
17 | Zvino | 42 | aiva |
18 | Asi | 43 | rake |
19 | vake | 44 | izvi |
20 | Kana | 45 | iyi |
21 | apo | 46 | zvakanaka |
22 | akati | 47 | kwaari |
23 | mumwe | 48 | Mubvunzi |
24 | nguva | 49 | zvino |
25 | zvose | 50 | zvekare |
The table shows the 50 most frequent words of the corpus. These are usually stopwords.
Language: Afrikaans
This list is a good candidate for a first stopword list for a language.
Usually a small, balanced corpus is enough to get a good list of high-frequency words. But if the small corpus is dominated by one very prominent topic, words from that topic will show up even among the top-ranked words.
select w_id-100 as rank_in_wordlist, word from words where w_id>100 order by w_id limit 50;
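The query above can be tried out with a minimal sketch like the following, which uses Python's standard `sqlite3` module. The table layout is an assumption inferred from the query itself: a `words` table in which `w_id` follows frequency rank and the first 100 ids are reserved for entries that are not counted as words. The placeholder names, the helper functions `build_toy_db` and `top_words`, and the three sample rows are hypothetical and only serve as illustration.

```python
import sqlite3

# Same query as in the text, only reformatted for readability.
QUERY = """
SELECT w_id - 100 AS rank_in_wordlist, word
FROM words
WHERE w_id > 100
ORDER BY w_id
LIMIT 50;
"""


def build_toy_db():
    """Create an in-memory database with a tiny, made-up `words` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE words (w_id INTEGER PRIMARY KEY, word TEXT)")
    # Assumption: ids 1..100 are reserved and skipped by the query.
    rows = [(i, f"<reserved-{i}>") for i in range(1, 101)]
    # Assumption: ids above 100 are ordered by descending frequency.
    rows += [(101, "kuti"), (102, "Mwari"), (103, "uye")]
    conn.executemany("INSERT INTO words VALUES (?, ?)", rows)
    return conn


def top_words(conn):
    """Return (rank, word) pairs for up to 50 top-ranked words."""
    return conn.execute(QUERY).fetchall()


if __name__ == "__main__":
    for rank, word in top_words(build_toy_db()):
        print(rank, word)
```

Subtracting 100 from `w_id` makes the ranks start at 1, so the result can be read directly as a candidate stopword list.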
3.4 Sample words for different frequency ranges